# Cascaded Diffusion Model
I2VGen-XL
MIT
An open-source video synthesis codebase developed by Alibaba's Tongyi Lab that integrates multiple advanced video generation models.
Text-to-Video
ali-vilab
4,252
172
Show-1 SR2
Show-1 is an efficient text-to-video generation model that combines the advantages of pixel and latent space diffusion models, capable of producing high-quality videos with precise text alignment.
Video Processing
showlab
127
10
Show-1 Interpolation
Show-1 is an efficient text-to-video generation model that combines the strengths of pixel and latent space diffusion models to produce high-quality videos that precisely match the input text.
Video Processing
showlab
163
3
IF-I-L-v1.0
DeepFloyd-IF is a pixel-based, three-stage cascaded diffusion model for text-to-image generation, noted for its photorealism and language understanding. It is reported to outperform the state-of-the-art models of its time, with a zero-shot FID-30K score of 6.66 on the COCO dataset.
Text-to-Image
DeepFloyd
4,299
20